[FEATURE] Add Unitree G1 Walking Demo #359
Cool! @ziyanx02 can you review this?
I borrowed some reward functions from https://github.com/unitreerobotics/unitree_rl_gym, which is released under the BSD-3 License, so I added another commit that credits the source in each function's docstring.
Looks impressive!
Thanks! For the 2 questions.
I have pushed the latest code, which adds more reward functions.
Latest video after training for 64 seconds: file_v3_00i0_34a51ae6-ceb3-4127-875a-1896355191ag.mp4
The training code does not include domain randomization or observation noise, which might explain why it needs much less training time. However, the current results give no indication that the policy could be deployed successfully. It would be better to improve the motion and then try sim-to-sim transfer or deploy it directly on the real robot.
I totally understand your concern. Because it's hard for me to get access to G1 hardware, I think the most feasible way for me to verify the policy is Sim2Sim. I'll also look into how to make it walk better. Thanks.
…d numerical instability problem
I wonder why this was closed. I have the G1 hardware and am willing to try it out, after sim-to-sim testing and additional randomization.
Hi @erwincoumans! I closed this PR because I recently returned to university and was worried I wouldn't have enough time to continue working on it. However, I’ve already added domain randomization and observation noise to my forked repo, as well as another MuJoCo environment to test the trained policy. While the policy works well in Genesis after training, it crashes in MuJoCo for reasons I haven’t been able to identify. The Sim2Sim transfer isn’t functioning as expected. I’m happy to reopen the issue, and if you’re able to test this pipeline on a real robot, that would be amazing!
I developed a similar version and can open a PR for it if anyone is interested. It uses the full-DoF G1. I also did the same with the H1. Here's a video: https://bsky.app/profile/miguelalonsojr.bsky.social/post/3lez5qcpe5k26
Hi @miguelalonsojr, I think we have the same problem: making it work only in Genesis is not enough. We have to show that the policy can be transferred to another simulator (Sim2Sim) or to the real robot (Sim2Real). This usually requires techniques such as domain randomization and observation noise.
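For readers unfamiliar with the two techniques mentioned above, here is a minimal sketch in plain Python. It is not the code from this PR or from the forked repo: the `PhysicsParams` fields, the sampling ranges, and the helper names are all hypothetical, chosen only to illustrate the idea. Domain randomization resamples physics parameters at the start of each episode, and observation noise perturbs what the policy sees at every step.

```python
import random
from dataclasses import dataclass


@dataclass
class PhysicsParams:
    """Hypothetical per-episode physics parameters (illustrative only)."""
    friction: float
    added_base_mass: float       # kg added to the torso link
    motor_strength_scale: float  # multiplier on commanded torque


def sample_physics(rng: random.Random) -> PhysicsParams:
    """Domain randomization: draw fresh physics parameters for one episode.

    The ranges below are assumptions, not values taken from the PR.
    """
    return PhysicsParams(
        friction=rng.uniform(0.4, 1.25),
        added_base_mass=rng.uniform(-1.0, 3.0),
        motor_strength_scale=rng.uniform(0.9, 1.1),
    )


def add_observation_noise(obs, noise_scale, rng):
    """Observation noise: add uniform noise in [-noise_scale, +noise_scale]
    to every entry of the observation before it reaches the policy."""
    return [o + rng.uniform(-noise_scale, noise_scale) for o in obs]


rng = random.Random(0)                 # fixed seed for reproducibility
params = sample_physics(rng)           # would be applied to the simulator on reset
clean_obs = [0.0, 0.1, -0.2]           # stand-in for joint positions, velocities, etc.
noisy_obs = add_observation_noise(clean_obs, noise_scale=0.05, rng=rng)
```

In a real training loop, `sample_physics` would be called on every environment reset and its result written into the simulator, while `add_observation_noise` would wrap the observation returned at each step.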
@0nhc I think it's OK to have a humanoid walking example in the repo without modeling realistic observations or adding domain randomization. The quadruped example, for instance, models neither sensor noise nor domain randomization. I think these are just starter examples and not really meant to be fully deployable. I wouldn't expect to be able to deploy a locomotion policy directly from the examples.
@miguelalonsojr can you please create a PR for the H1 as well?
I think it's cool to have another humanoid RL demo, so I added an RL demo for Unitree G1 walking in this PR. Here are my changes:
Added the Unitree G1 URDF under assets/urdf, with its original BSD-3 License.
Added 3 Python scripts (g1_env.py, g1_train.py, g1_eval.py) under examples/locomotion.
Demo video after training for 69.5 seconds on my PC:
file_v3_00i0_64cebb24-16ec-4d67-970d-02ea2dd42dcg.mp4